Abstract

We develop two analytic lower bounds on the probability of success p of identifying 
a state picked from a known ensemble of pure states: a bound based on the pairwise 
inner products of the states, and a bound based on the eigenvalues of their Gram 
matrix. We use the latter to lower bound the asymptotic distinguishability of ensembles 
of n random quantum states in d dimensions, where n/d approaches a constant. In 
particular, for almost all ensembles of n states in n dimensions, p > 0.72. An application 
to distinguishing Boolean functions (the "oracle identification problem") in quantum
computation is given. 



1 Introduction 

A fundamental property of quantum mechanics is that non-orthogonal pure quantum states
cannot be distinguished perfectly. This leads to the following quantum detection problem:
given an unknown quantum state $|\psi_?\rangle$ picked from a known set $\mathcal{E}$ with known
a priori probabilities, find the "optimal" measurement $M^{opt}$ to determine $|\psi_?\rangle$.
Several different criteria for optimality may be considered [12], [5], [?]; here we only concern
ourselves with optimising the probability of success $P^{opt}$, and in particular the related
state distinguishability problem of finding $P^{opt}$ without necessarily finding $M^{opt}$.
Efficient optimisation techniques can be used to estimate $P^{opt}$ numerically [7]; however,
the problem of finding an analytic expression for $P^{opt}$ seems intractable. We are therefore
led to attempting to produce bounds on $P^{opt}$.

This note derives two lower bounds on $P^{opt}$: one based on the pairwise distinguishability
of the states in $\mathcal{E}$, and one based on the eigenvalues of their Gram matrix. We use the
latter, and a powerful result from random matrix theory (the Marčenko-Pastur law [18]), to
bound the probability of distinguishing a set of random quantum states, for a quite general
notion of randomness. This has an application to quantum computation in the so-called
oracle identification problem introduced by Ambainis et al., where we are given an n-bit
Boolean function f picked from a known set of N functions, and must identify f with the
minimum number of queries to f. We show that, for all but an exponentially small fraction
of sets with $N = 2^n$, a quantum computer can perform this task successfully in a constant
number of queries (with arbitrarily high probability), whereas classical computation requires
n queries for all such sets.

As showing that a set of quantum states are quite distinguishable forms an essential 
part of proofs in many areas of quantum information theory, we hope that these results will 
find application elsewhere. 

The organisation of the paper is as follows. Section 2 introduces notation and our main
tool, the so-called "pretty good measurement", before moving on to give the lower bounds
on $P^{opt}$. An extension of the lower bounds to mixed states is considered. Section 3 applies
the bounds to a specific family of ensembles (those where all the states have constant inner 
product). Section 4 describes the random matrix theory we will be using, and applies it 
to the distinguishability of random quantum states. Section 5 gives the application to the 
oracle identification problem, and the paper closes with some discussion in section 6. 

2 Bounds on the distinguishability of quantum states 

We consider an ensemble $\mathcal{E}$ containing n d-dimensional pure states $\{|\psi_i\rangle\}$ with
their a priori probabilities $p_i$. We will use $\{|\psi'_i\rangle\}$ to denote the set containing the
same states, renormalised to reflect their probabilities (i.e. $|\psi'_i\rangle = \sqrt{p_i}|\psi_i\rangle$).
Given an unknown state $|\psi_?\rangle$, picked in accordance with these probabilities, the quantity
we are interested in is the average probability of success for a given generalised measurement
to distinguish which state we were given. For a measurement M (given by a set of positive
operators $\{M_i\}$ summing to the identity), let this probability be denoted by $P^M(\mathcal{E})$.
Then we have

$$P^M(\mathcal{E}) = \sum_{i=1}^n p_i \langle\psi_i|M_i|\psi_i\rangle = \sum_{i=1}^n \langle\psi'_i|M_i|\psi'_i\rangle \tag{1}$$

$M^{opt}(\mathcal{E})$ will denote the measurement with the optimal probability of success, and in an
abuse of notation $P^{opt}(\mathcal{E})$ will denote this optimal probability. We call this the optimal
probability of distinguishing the states in $\mathcal{E}$.



We will make use of the trace norm $\|A\|_{tr} = \sum_i \sigma_i(A)$ and the 2-norm
$\|A\|_2 = \sqrt{\sum_i \sigma_i(A)^2}$ of a matrix A, where $\sigma_i(A)$
denotes the i'th singular value of A. We will often use the d x n state matrix
$S = S(\mathcal{E}) = (|\psi'_1\rangle, \ldots, |\psi'_n\rangle)$ whose i'th column is the state $|\psi'_i\rangle$.
Then $G = S^\dagger S$ gives the n x n Gram matrix [?] encoding all the inner products between the
renormalised states in $\mathcal{E}$. If n > d, G will have at least n - d zero eigenvalues. Note that
every rectangular matrix M with $\|M\|_2 = 1$ is a state matrix. $\rho$ will represent the density
matrix of the ensemble:

$$\rho = \sum_{i=1}^n p_i |\psi_i\rangle\langle\psi_i| = \sum_{i=1}^n |\psi'_i\rangle\langle\psi'_i| = SS^\dagger \tag{2}$$

It is well-known [?] that G and $\rho$ have the same non-zero eigenvalues.

2.1 Use of the "pretty good measurement"

We will use a specific measurement to provide bounds on $P^{opt}(\mathcal{E})$, which is "canonical" in
the sense that it performs reasonably well for any ensemble $\mathcal{E}$. This is the so-called pretty
good measurement (PGM), which was independently identified by several authors (e.g. [?],
[?]) and has a number of useful properties. It is usually defined as a set of projectors
$\{|\nu_i\rangle\langle\nu_i|\}$ onto "measurement vectors" $|\nu_i\rangle$, where $|\nu_i\rangle = \rho^{-1/2}|\psi'_i\rangle$ (the inverse
only being taken on the support of $\rho$). However, it may also be defined implicitly, which
brings out its "canonical" nature.

To this end, consider an arbitrary measurement M for $\mathcal{E}$ that consists of a set of n rank
1 projectors onto unnormalised measurement vectors $|\mu_i\rangle$, where each measurement vector
corresponds to a state $|\psi'_i\rangle$ in the ensemble. (In fact, it turns out that the optimal
measurement for an ensemble of pure states always falls into this category [7].) The probability
of getting measurement outcome i and receiving state j is then $|\langle\mu_i|\psi'_j\rangle|^2$, and the overall
probability of success of this measurement is $\sum_{i=1}^n |\langle\mu_i|\psi'_i\rangle|^2$. We may thus encode all
the inner products (and hence the probabilities) in a matrix P, where $P_{ij} = \langle\mu_i|\psi'_j\rangle$; and
rather than looking for an optimal measurement M, we can rephrase our task as looking for an
optimal matrix P that corresponds to a valid measurement.
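As an illustrative aside, the objects introduced so far can be built numerically. The sketch below assumes NumPy is available; the sizes, the random seed and all variable names are our own choices rather than part of the original argument. It forms a state matrix S, the Gram matrix $G = S^\dagger S$ and the density matrix $\rho = SS^\dagger$, compares their spectra, and builds the matrix $P_{ij} = \langle\nu_i|\psi'_j\rangle$ for the measurement vectors $|\nu_i\rangle = \rho^{-1/2}|\psi'_i\rangle$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 3                      # illustrative sizes (assumption): n states in d dimensions

# Random normalised states as columns, with arbitrary a priori probabilities.
psi = rng.normal(size=(d, n)) + 1j * rng.normal(size=(d, n))
psi /= np.linalg.norm(psi, axis=0)
p = rng.random(n)
p /= p.sum()

S = psi * np.sqrt(p)             # state matrix: column i is sqrt(p_i)|psi_i>
G = S.conj().T @ S               # n x n Gram matrix
rho = S @ S.conj().T             # d x d density matrix

eig_G = np.sort(np.linalg.eigvalsh(G))[::-1]
eig_rho = np.sort(np.linalg.eigvalsh(rho))[::-1]

# Measurement vectors |nu_i> = rho^{-1/2}|psi'_i>, inverse taken on the support of rho.
w, V = np.linalg.eigh(rho)
keep = w > 1e-12
rho_inv_sqrt = (V[:, keep] / np.sqrt(w[keep])) @ V[:, keep].conj().T
nu = rho_inv_sqrt @ S
povm_sum = nu @ nu.conj().T      # sum_i |nu_i><nu_i|

P = nu.conj().T @ S              # P_ij = <nu_i|psi'_j>
success = float(np.sum(np.abs(np.diag(P)) ** 2))

print(np.round(eig_G, 6))
print(np.round(eig_rho, 6))
print(success)
```

With n > d the Gram matrix has n - d (numerically) zero eigenvalues and the remaining ones agree with those of $\rho$; `povm_sum` is the identity because generic random states span the whole space.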

We have the following requirement on P, from the fact that M must be a valid POVM.

$$(P^\dagger P)_{ij} = \sum_{k=1}^n \langle\psi'_i|\mu_k\rangle\langle\mu_k|\psi'_j\rangle = \langle\psi'_i|\left(\sum_{k=1}^n |\mu_k\rangle\langle\mu_k|\right)|\psi'_j\rangle = G_{ij} = (S^\dagger S)_{ij} \tag{3}$$

A natural way to produce a matrix P that satisfies this condition from any given S is to
take $P = \sqrt{G}$, the positive semidefinite square root of G. The PGM turns out to be a
measurement corresponding to this matrix P, for, if $P_{ij} = \langle\nu_i|\psi'_j\rangle$, then

$$(P^2)_{ij} = \sum_{k=1}^n \langle\psi'_i|\rho^{-1/2}|\psi'_k\rangle\langle\psi'_k|\rho^{-1/2}|\psi'_j\rangle = \langle\psi'_i|\rho^{-1/2}\left(\sum_{k=1}^n |\psi'_k\rangle\langle\psi'_k|\right)\rho^{-1/2}|\psi'_j\rangle = G_{ij} \tag{4}$$

The probability of success for the PGM is thus given by $P^{pgm}(\mathcal{E}) = \sum_{i=1}^n (\sqrt{G})_{ii}^2$.
Barnum and Knill have proved [3] that the PGM has the further property that it is almost
optimal in the following sense.

Theorem 2.1. (Barnum, Knill) [3] $P^{pgm}(\mathcal{E}) \ge P^{opt}(\mathcal{E})^2$.

So there is the overall relationship $P^{opt}(\mathcal{E})^2 \le P^{pgm}(\mathcal{E}) \le P^{opt}(\mathcal{E})$. For completeness,
we include (in Appendix A) a simplified proof of Barnum and Knill's result in the case of
pure states.
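A small numerical illustration of the relationship $P^{opt}(\mathcal{E})^2 \le P^{pgm}(\mathcal{E}) \le P^{opt}(\mathcal{E})$ is possible for two states, where the optimal success probability is given by Helstrom's formula $P^{opt} = \frac{1}{2}(1 + \sqrt{1 - 4p_1p_2|\langle\psi_1|\psi_2\rangle|^2})$. The sketch below assumes NumPy; the helper names `sqrtm_psd` and `pgm_success` and the angle `theta` are hypothetical choices made for this example only.

```python
import numpy as np

def sqrtm_psd(G):
    """Positive semidefinite square root via an eigendecomposition."""
    w, U = np.linalg.eigh(G)
    w = np.clip(w, 0.0, None)            # remove tiny negative rounding errors
    return (U * np.sqrt(w)) @ U.conj().T

def pgm_success(S):
    """P^pgm = sum_i (sqrt(G))_ii^2 for a state matrix S (columns sqrt(p_i)|psi_i>)."""
    G = S.conj().T @ S
    return float(np.sum(np.real(np.diag(sqrtm_psd(G))) ** 2))

# Two equiprobable real qubit states with overlap cos(theta).
theta = 0.7
states = np.array([[1.0, np.cos(theta)],
                   [0.0, np.sin(theta)]])
S = np.sqrt(0.5) * states
p_pgm = pgm_success(S)

# Helstrom's formula for the optimal two-state success probability (equal priors).
p_opt = 0.5 * (1.0 + np.sqrt(1.0 - np.cos(theta) ** 2))
print(p_pgm, p_opt)
```

For two equiprobable pure states the two printed values coincide; with unequal priors the PGM value generally lies strictly between $P^{opt}(\mathcal{E})^2$ and $P^{opt}(\mathcal{E})$.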

2.2 Bounds from the pairwise inner products 

A set of states that are pairwise almost orthogonal are pairwise almost distinguishable. It
thus seems intuitively clear that, given such a set, the probability of success in distinguishing
one state from all the others must also be high. However, this intuition is wrong. This was
noted by Jozsa and Schlienz [?], who showed that the inner products of an ensemble of
states may all be reduced, while simultaneously reducing the von Neumann entropy of the
ensemble (which gives a measure of overall distinguishability). This effect also manifests
itself in quantum fingerprinting [3]. Here, d-dimensional states are "compressed" to log d-
dimensional "fingerprint" states that can be distinguished pairwise. However, given such
a fingerprint the corresponding original state may not be identified, as this would violate
Holevo's theorem [13].

Nevertheless, for certain ensembles the pairwise inner products can give a good lower
bound on the overall distinguishability, as noted by several authors [?]. In this section,
we derive such a bound. Our approach is based on that of Hausladen et al., who found
a parabola forming a lower bound on the square root function, which is useful because of
the following lemma.

Lemma 2.2. If the function $\sqrt{x}$ is bounded below by $f(x) = ax + bx^2$ for $x \ge 0$, then
$(\sqrt{G})_{ii} \ge aG_{ii} + b\sum_{j=1}^n |G_{ij}|^2$.

Proof. G is a positive semidefinite matrix and thus may be diagonalised: $G = UDU^\dagger$,
where $D = \mathrm{diag}(\{\lambda_i\})$ and $U = (u_{ij})$ is unitary. Working out the matrix algebra shows
that $(\sqrt{G})_{ii} = \sum_{k=1}^n \sqrt{\lambda_k}|u_{ik}|^2$, so $(\sqrt{G})_{ii} \ge \sum_{k=1}^n f(\lambda_k)|u_{ik}|^2 = f(G)_{ii}$. But
$f(G)_{ii} = (aG + bG^2)_{ii} = aG_{ii} + b\sum_{j=1}^n G_{ij}G_{ji} = aG_{ii} + b\sum_{j=1}^n |G_{ij}|^2$. □

Our goal will be to find a and b to parametrise f such that $aG_{ii} + b\sum_{j=1}^n |G_{ij}|^2$ is
maximised. It is clear that, for this to be maximised, f(r) must equal $\sqrt{r}$ for some r (or we
could just increase a or b). So we will pick a and b such that $f(r) = \sqrt{r}$ and
$f'(r) = \frac{d}{dr}\sqrt{r}$ (i.e. the curves are tangent at this point). This leads to the
simultaneous equations

$$ar + br^2 = \sqrt{r}, \qquad a + 2br = \frac{1}{2\sqrt{r}} \tag{5}$$

Solving for a and b gives the optimal values

$$a = \frac{3}{2\sqrt{r}}, \qquad b = -\frac{1}{2r^{3/2}} \tag{6}$$

To see that f(x) actually is a lower bound for $\sqrt{x}$ for any positive value of r (with these
values for a and b), note that the only solutions to the related equation $f(x)^2 = x$ are
x = 0, x = r, or x = 4r. As f(4r) is negative, we have that $f(x) = \sqrt{x}$ if and only if
x = 0 or x = r. So the only remaining possibility is that $f(x) > \sqrt{x}$ for all 0 < x < r.
Plugging in a suitable value of x (e.g. r/2) shows that this is not the case. The expression
$aG_{ii} + b\sum_{j=1}^n |G_{ij}|^2$ may now be expressed solely in terms of r. Optimising this for r
gives that the maximum is found at the point

$$r = \frac{\sum_{j=1}^n |G_{ij}|^2}{G_{ii}} \tag{7}$$

Returning to the original inequality, we have

$$(\sqrt{G})_{ii} \ge \frac{3}{2\sqrt{r}}G_{ii} - \frac{1}{2r^{3/2}}\sum_{j=1}^n |G_{ij}|^2 \;\Rightarrow\; (\sqrt{G})_{ii} \ge \frac{G_{ii}^{3/2}}{\sqrt{\sum_{j=1}^n |G_{ij}|^2}} \tag{8}$$

We thus have the following bound on the probability of distinguishing the states in $\mathcal{E}$,
obtained by squaring (8), summing over i, and using $G_{ij} = \sqrt{p_ip_j}\langle\psi_i|\psi_j\rangle$.

$$P^{pgm}(\mathcal{E}) \ge \sum_{i=1}^n \frac{p_i^2}{\sum_{j=1}^n p_j|\langle\psi_i|\psi_j\rangle|^2} \tag{9}$$

If all the states have equal a priori probabilities, the bound simplifies further to

$$P^{pgm}(\mathcal{E}) \ge \frac{1}{n}\sum_{i=1}^n \frac{1}{\sum_{j=1}^n |\langle\psi_i|\psi_j\rangle|^2} \tag{10}$$

Unlike previous bounds obtained by other authors for the probability of success of the PGM
[?], the bound (9) is always positive and greater than or equal to $\sum_{i=1}^n p_i^2$, thus showing
that the PGM always does at least as well as the "non-measurement" of guessing which
state was received in accordance with their a priori probabilities.
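The pairwise bound $P^{pgm}(\mathcal{E}) \ge \sum_i p_i^2 / \sum_j p_j|\langle\psi_i|\psi_j\rangle|^2$ is easy to evaluate numerically. The following sketch (assuming NumPy; the function names, sizes and seed are hypothetical) compares it, for one random ensemble, with the exact PGM success probability and with the guessing probability $\sum_i p_i^2$.

```python
import numpy as np

def pgm_success(psi, p):
    """Exact PGM success probability sum_i (sqrt(G))_ii^2."""
    S = psi * np.sqrt(p)
    w, U = np.linalg.eigh(S.conj().T @ S)
    root = (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T
    return float(np.sum(np.real(np.diag(root)) ** 2))

def inner_product_bound(psi, p):
    """sum_i p_i^2 / sum_j p_j |<psi_i|psi_j>|^2."""
    overlaps = np.abs(psi.conj().T @ psi) ** 2       # |<psi_i|psi_j>|^2
    return float(np.sum(p ** 2 / (overlaps @ p)))

rng = np.random.default_rng(1)
d, n = 8, 6                                          # illustrative sizes (assumption)
psi = rng.normal(size=(d, n)) + 1j * rng.normal(size=(d, n))
psi /= np.linalg.norm(psi, axis=0)
p = rng.random(n)
p /= p.sum()

guess = float(np.sum(p ** 2))
bound = inner_product_bound(psi, p)
exact = pgm_success(psi, p)
print(guess, bound, exact)
```

The three printed numbers should appear in non-decreasing order, reflecting that the bound never falls below the guessing probability and never exceeds the PGM success probability.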

2.3 Bounds from eigenvalues 

The eigenvalues of a Hermitian matrix are closely related to its diagonal elements; indeed,
the former majorises the latter [?]. With this in mind, we look for a bound on the unknown
diagonal elements of $\sqrt{G}$ in terms of the known eigenvalues $\{\lambda_i\}$ of G.



Lemma 2.3. $P^{pgm}(\mathcal{E}) \ge \frac{1}{n}\left(\sum_{i=1}^n \sqrt{\lambda_i}\right)^2 = \frac{1}{n}\|S\|_{tr}^2$.

Proof. By the fact that the trace of a matrix is the sum of its eigenvalues, we have

$$\sum_{i=1}^n (\sqrt{G})_{ii} = \sum_{i=1}^n \sqrt{\lambda_i} \tag{11}$$

$$\left(\sum_{i=1}^n (\sqrt{G})_{ii}\right)^2 = \left(\sum_{i=1}^n \sqrt{\lambda_i}\right)^2 \tag{12}$$

$$n\sum_{i=1}^n (\sqrt{G})_{ii}^2 \ge \left(\sum_{i=1}^n \sqrt{\lambda_i}\right)^2 \tag{13}$$

$$P^{pgm}(\mathcal{E}) \ge \frac{1}{n}\left(\sum_{i=1}^n \sqrt{\lambda_i}\right)^2 \tag{14}$$

where in (13) we used a Cauchy-Schwarz inequality, showing that equality can only be
attained in step (13) when all the $(\sqrt{G})_{ii}$ are equal. □

Interestingly, this bound is the same as the fidelity of G with the maximally mixed state
I/n, where the fidelity $F(\rho, \sigma)$ is defined as $\left(\mathrm{tr}\sqrt{\rho^{1/2}\sigma\rho^{1/2}}\right)^2$ [?].

It is worth noting that no upper bound on the success probability in terms of the
eigenvalues alone can be found, for the following reason. Any set of eigenvalues $\{\lambda_i\}$
summing to 1 can give rise to a Gram matrix G where $G_{ii} = \lambda_i$, and $G_{ij} = 0$ (for $i \ne j$).
Such matrices correspond to an ensemble $\mathcal{E}$ of perfectly distinguishable states where
$P^{pgm}(\mathcal{E}) = 1$. As future work, it would be interesting to determine whether an upper bound
(or an improved lower bound) could be produced by considering the diagonal entries of G as
well as its eigenvalues.
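The eigenvalue bound $\frac{1}{n}(\sum_i \sqrt{\lambda_i})^2 = \frac{1}{n}\|S\|_{tr}^2$ can be checked against the exact PGM success probability for a concrete ensemble. The sketch below assumes NumPy (using `np.linalg.norm(..., 'nuc')` for the trace norm); the sizes and seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 10, 10                              # illustrative sizes (assumption)
psi = rng.normal(size=(d, n)) + 1j * rng.normal(size=(d, n))
psi /= np.linalg.norm(psi, axis=0)
S = psi / np.sqrt(n)                       # equiprobable states, p_i = 1/n

G = S.conj().T @ S
lam, U = np.linalg.eigh(G)
lam = np.clip(lam, 0.0, None)              # guard against tiny negative rounding errors
sqrtG = (U * np.sqrt(lam)) @ U.conj().T

p_pgm = float(np.sum(np.real(np.diag(sqrtG)) ** 2))
eig_bound = float(np.sum(np.sqrt(lam)) ** 2 / n)
trace_norm_bound = float(np.linalg.norm(S, 'nuc') ** 2 / n)
print(eig_bound, trace_norm_bound, p_pgm)
```

The first two printed values agree (they are the same quantity computed in two ways), and neither exceeds the third.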

2.4 Distinguishing mixed states 

It is natural to ask to what extent these lower bounds hold for the generalised problem of
distinguishing an ensemble $\mathcal{E}$ consisting of mixed states $\{\rho_i\}$. The following lemma allows
the problem to be related to that of distinguishing pure states.

Lemma 2.4. Let $\mathcal{E}$ be an ensemble of n d-dimensional mixed states $\{\rho_i\}$ with a priori
probabilities $\{p_i\}$, and having spectral decompositions $\rho_i = \sum_{k=1}^d \lambda_{ik}|v_{ik}\rangle\langle v_{ik}|$. Let
$\mathcal{F}$ be an ensemble of the nd pure states given by the eigenvectors $\{|v_{ik}\rangle\}$ with a priori
probabilities $\{p_i\lambda_{ik}\}$. Then $P^{pgm}(\mathcal{E}) \ge P^{pgm}(\mathcal{F})$.

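Proof. For mixed states, the PGM is defined by the following measurement operators $\{M_i\}$:

$$M_i = \rho^{-1/2}\rho'_i\rho^{-1/2}, \quad\text{where } \rho'_i = p_i\rho_i \text{ and } \rho = \sum_{i=1}^n \rho'_i \tag{15}$$

So the probability of success can be bounded as follows, where we use the renormalised
eigenvectors $|v'_{ik}\rangle = \sqrt{p_i\lambda_{ik}}|v_{ik}\rangle$.

$$P^{pgm}(\mathcal{E}) = \sum_{i=1}^n \mathrm{tr}\left(\rho^{-1/2}\rho'_i\rho^{-1/2}\rho'_i\right) \tag{16}$$

$$= \sum_{i=1}^n \mathrm{tr}\left(\rho^{-1/2}\left(\sum_{k=1}^d |v'_{ik}\rangle\langle v'_{ik}|\right)\rho^{-1/2}\left(\sum_{l=1}^d |v'_{il}\rangle\langle v'_{il}|\right)\right) \tag{17}$$

$$= \sum_{i=1}^n \sum_{k,l=1}^d \mathrm{tr}\left(\rho^{-1/2}|v'_{ik}\rangle\langle v'_{ik}|\rho^{-1/2}|v'_{il}\rangle\langle v'_{il}|\right) \tag{18}$$

$$= \sum_{i=1}^n \sum_{k,l=1}^d |\langle v'_{ik}|\rho^{-1/2}|v'_{il}\rangle|^2 \ge \sum_{i=1}^n \sum_{k=1}^d |\langle v'_{ik}|\rho^{-1/2}|v'_{ik}\rangle|^2 = P^{pgm}(\mathcal{F}) \tag{19}$$

□

Therefore, if the eigenvalues and eigenvectors of the states $\{\rho_i\}$ are known, the lower
bounds given previously may be applied. If not, a weaker lower bound based only on
the pairwise fidelities of the states may be given (where, as before, we set
$F(\rho, \sigma) = \left(\mathrm{tr}\sqrt{\rho^{1/2}\sigma\rho^{1/2}}\right)^2$).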



Theorem 2.5. Let $\mathcal{E}$ be an ensemble of n d-dimensional mixed states $\{\rho_i\}$ with a priori
probabilities $\{p_i\}$. Then

P P9m (£) > £ ^n P ' tr( fP , (20) 

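Proof. From the bound (9) and Lemma 2.4 we have

$$P^{pgm}(\mathcal{E}) \ge \sum_{i=1}^n\sum_{k=1}^d \frac{p_i^2\lambda_{ik}^2}{\sum_{j=1}^n\sum_{l=1}^d p_j\lambda_{jl}|\langle v_{ik}|v_{jl}\rangle|^2} \tag{21}$$

$$= \sum_{i=1}^n\sum_{k=1}^d \frac{p_i^2\lambda_{ik}^2}{\sum_{j=1}^n p_j\langle v_{ik}|\left(\sum_{l=1}^d \lambda_{jl}|v_{jl}\rangle\langle v_{jl}|\right)|v_{ik}\rangle} \tag{22}$$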



n d „2\2 



> 



EEv^rm ( 23 ) 



□ 

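This bound gets progressively worse as the states in $\mathcal{E}$ get more mixed. One might
expect the following lower bound to hold for mixed states, as it is the obvious extension of
the bound (9) for pure states, but interestingly it does not.

$$P^{pgm}(\mathcal{E}) \ge \sum_{i=1}^n \frac{p_i^2}{\sum_{j=1}^n p_j F(\rho_i, \rho_j)} \tag{25}$$

A simple counterexample is given by the equiprobable ensemble consisting of the following
two three-dimensional states.

$$\rho_1 = \begin{pmatrix} \frac{1}{2} & 0 & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \rho_2 = \begin{pmatrix} \frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \frac{1}{2} \end{pmatrix} \tag{26}$$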
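The counterexample can be verified directly. The sketch below assumes NumPy and takes the two states to be $\rho_1 = \mathrm{diag}(1/2, 1/2, 0)$ and $\rho_2 = \mathrm{diag}(1/2, 0, 1/2)$ (our reading of the matrices above); the helper names are hypothetical. It evaluates the mixed-state PGM success probability $\sum_i \mathrm{tr}(\rho^{-1/2}\rho'_i\rho^{-1/2}\rho'_i)$ and the candidate expression $\sum_i p_i^2 / \sum_j p_j F(\rho_i, \rho_j)$.

```python
import numpy as np

def psd_power(A, power):
    """A**power for a positive semidefinite matrix, acting on the support of A only."""
    w, V = np.linalg.eigh(A)
    keep = w > 1e-12
    return (V[:, keep] * w[keep] ** power) @ V[:, keep].conj().T

def fidelity(r, s):
    """F(r, s) = (tr sqrt(r^{1/2} s r^{1/2}))^2."""
    root = psd_power(r, 0.5)
    w = np.clip(np.linalg.eigvalsh(root @ s @ root), 0.0, None)
    return float(np.sum(np.sqrt(w)) ** 2)

rhos = [np.diag([0.5, 0.5, 0.0]), np.diag([0.5, 0.0, 0.5])]
p = [0.5, 0.5]

rho = sum(pi * r for pi, r in zip(p, rhos))
inv_sqrt = psd_power(rho, -0.5)

# PGM success probability with rho'_i = p_i rho_i.
p_pgm = float(sum(np.trace(inv_sqrt @ (pi * r) @ inv_sqrt @ (pi * r)).real
                  for pi, r in zip(p, rhos)))

# Candidate (invalid) bound: sum_i p_i^2 / sum_j p_j F(rho_i, rho_j).
candidate = float(sum(pi ** 2 / sum(pj * fidelity(ri, rj) for pj, rj in zip(p, rhos))
                      for pi, ri in zip(p, rhos)))
print(p_pgm, candidate)
```

For these matrices the PGM succeeds with probability 3/4 while the candidate expression evaluates to 4/5, so the candidate cannot be a lower bound.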

3 The distinguishability of states with constant inner product

An illustrative case to apply these bounds to is that of equiprobable states where the
pairwise inner products are all equal, so the states are all equally distinguishable from each
other. Consider an ensemble $\mathcal{E}$ with Gram matrix G, where $G_{ii} = 1/n$ and $G_{ij} = p/n$ for
$i \ne j$ (and p is a positive real constant). In this case, the inner product bound of Section
2.2 gives the bound

$$P^{pgm}(\mathcal{E}) \ge \frac{1}{1 + p^2(n-1)} = O(1/n) \tag{27}$$

The eigenvalue bound, however, gives much better results. The symmetry of G shows
immediately that it has an eigenvector (1, 1, ..., 1); the corresponding eigenvalue is
$\lambda_1 = p + (1-p)/n$. The set of eigenvectors may be completed by taking any n - 1 vectors
orthogonal to (1, 1, ..., 1), which will be eigenvectors with eigenvalues $\lambda_{2 \ldots n} = (1-p)/n$.
We therefore have

$$P^{pgm}(\mathcal{E}) \ge \frac{1}{n}\left(\sqrt{p + \frac{1-p}{n}} + (n-1)\sqrt{\frac{1-p}{n}}\right)^2 \tag{28}$$

$$\ge (1-p) - \frac{2(1-p)}{n} \tag{29}$$

so the probability of distinguishing these states approaches a constant as $n \to \infty$. In fact,
one can show that inequality (28) is actually an equality giving the precise probability of
success $P^{pgm}(\mathcal{E})$ (this follows from showing that the diagonal entries of $\sqrt{G}$ are all equal).

Such an ensemble therefore provides a kind of converse to the ensemble of states used in
quantum fingerprinting [?]: in this case, no matter how many states there are in the
ensemble, their joint distinguishability is of the same order as their pairwise distinguishability.
We will see below that this behaviour is not typical; however, it is perhaps not surprising,
because $\mathcal{E}$ can only be realised in n dimensions. To see this, note that G is non-singular, so
the states in $\mathcal{E}$ must be linearly independent.


4 The distinguishability of random quantum states 

We will use Lemma 2.3 and some results from the theory of random matrices to put a lower
bound on the probability of distinguishing random quantum states. The expected value of
this lower bound will be obtained for a quite general notion of "randomness", but in order
to get measure concentration results we will specialise to states distributed uniformly at
random (according to the Haar measure). The results hold in the asymptotic regime where
the number of states n and the dimension d both grow while n/d approaches a constant.

4.1 A little random matrix theory 

In this section, we will calculate the expected value of the trace norm of a random matrix.
The distribution of the trace norm (i.e. the sum of singular values) of a matrix M is clearly
related to that of the eigenvalues of the matrix $MM^\dagger$, which is known to statisticians as a
(complex) Wishart matrix. The distribution of the eigenvalues of a Wishart matrix is given
by the Marčenko-Pastur law [18], which is stated in the form we need in [?].

Theorem 4.1. (Marčenko-Pastur law) [18]
Let $R_r$ be a family of d x n matrices with $n \ge d$ and $d/n \to r \in (0, 1]$ as $n, d \to \infty$, where
the entries of $R_r$ are i.i.d. complex random variables with mean 0 and variance 1. Then,
as $n, d \to \infty$, the eigenvalues of the rescaled matrix $\frac{1}{n}R_rR_r^\dagger$ tend to a limiting
distribution with density

$$\frac{\sqrt{(x - A^2)(B^2 - x)}}{2\pi r x} \tag{30}$$

for $A^2 \le x \le B^2$ (where $A = 1 - \sqrt{r}$, $B = 1 + \sqrt{r}$), and density 0 elsewhere.

We will translate this to a similar statement about the singular values of $R_r$. The
following lemma is straightforward.

Lemma 4.2. Let $R_r$ be a family of d x n matrices with $k/m \to r \in (0, 1]$ as $n, d \to \infty$,
where k = min(n, d) and m = max(n, d), and the entries of $R_r$ are i.i.d. complex random
variables with mean 0 and variance 1. Then, as $n, d \to \infty$, the singular values of $R_r/\sqrt{m}$
tend to a limiting distribution with density

$$p_r(y) = \frac{\sqrt{(y^2 - A^2)(B^2 - y^2)}}{\pi r y} \tag{31}$$

for $A \le y \le B$ (where $A = 1 - \sqrt{r}$, $B = 1 + \sqrt{r}$), and density 0 elsewhere.

Proof. The lemma follows from Theorem 4.1 for $n \ge d$ by substituting $y = \sqrt{x}$. For n < d,
note that the singular values of R are the same as those of $R^T$, so the roles of n and d need
merely be interchanged. □

Lemma 4.3. Let $R_r$ be a family of d x n matrices with $k/m \to r \in (0, 1]$ as $n, d \to \infty$,
where k = min(n, d) and m = max(n, d), and the entries of $R_r$ are i.i.d. complex random
variables with mean 0 and variance 1. Then, as $n, d \to \infty$, the expected trace norm of $R_r$ is

$$E(\|R_r\|_{tr}) = \frac{k\sqrt{m}}{\pi r}\int_A^B \sqrt{(y^2 - A^2)(B^2 - y^2)}\,dy \tag{32}$$

where $A = 1 - \sqrt{r}$, $B = 1 + \sqrt{r}$.

Proof. With probability 1, $R_r$ will have k non-zero singular values. Let $\sigma_i(R_r)$ denote the
value of the i'th (unsorted) singular value of $R_r$, for arbitrary i between 1 and k. We have

$$E(\|R_r\|_{tr}) = k\sqrt{m}\,E(\sigma_i(R_r/\sqrt{m})) = k\sqrt{m}\int_A^B y\,p_r(y)\,dy \tag{33}$$

and using Lemma 4.2 gives the desired result. □

This turns out to be an elliptic integral which cannot be expressed in terms of elementary
functions [?]. However, it is possible to produce a good lower bound, which is tight in the
case r = 1:

Lemma 4.4.

$$E(\|R_r\|_{tr}) \ge k\sqrt{m}\sqrt{1 - r\left(1 - \frac{64}{9\pi^2}\right)} \tag{34}$$

with equality when r = 1.

Proof. See Appendix B. □

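The lower bound on the expected trace norm can be compared with sampled random matrices. The sketch below assumes NumPy and uses complex Gaussian entries as one convenient instance of i.i.d. entries with mean 0 and variance 1; the matrix sizes, seed and function names are illustrative choices, and finite-size samples only approximate the asymptotic statement.

```python
import numpy as np

C = 64.0 / (9.0 * np.pi ** 2)

def normalised_trace_norm(d, n, rng):
    """||R||_tr / (k sqrt(m)) for a d x n matrix of i.i.d. complex Gaussians (mean 0, variance 1)."""
    R = (rng.normal(size=(d, n)) + 1j * rng.normal(size=(d, n))) / np.sqrt(2.0)
    k, m = min(d, n), max(d, n)
    return float(np.linalg.norm(R, 'nuc') / (k * np.sqrt(m)))

def lemma_bound(r):
    """sqrt(1 - r (1 - 64/(9 pi^2))), i.e. the lower bound divided by k sqrt(m)."""
    return float(np.sqrt(1.0 - r * (1.0 - C)))

rng = np.random.default_rng(5)
for d, n in ((200, 200), (100, 200), (50, 200)):
    r = min(d, n) / max(d, n)
    print(r, normalised_trace_norm(d, n, rng), lemma_bound(r))
```

For square matrices (r = 1) the sampled value should sit close to $8/(3\pi) \approx 0.849$, where the bound is tight; for smaller r the sampled value should lie slightly above the bound.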
4.2 Random quantum states 

Knowing the expected value of the trace norm immediately allows us to say something
about the expected distinguishability of an ensemble of random quantum states, for a quite
general notion of randomness.

Theorem 4.5. Let $\mathcal{E}$ be an ensemble of n equiprobable d-dimensional quantum states
$\{|\psi_i\rangle\}$ with $n/d \to r \in (0, \infty)$ as $n, d \to \infty$, and let the components of $|\psi_i\rangle$ in some
basis be i.i.d. complex random variables with mean 0 and variance 1/d. Then

$$E(P^{pgm}(\mathcal{E})) \ge \begin{cases} 1 - r\left(1 - \frac{64}{9\pi^2}\right) & \text{if } n \le d \\ \frac{1}{r}\left(1 - \frac{1}{r}\left(1 - \frac{64}{9\pi^2}\right)\right) & \text{otherwise} \end{cases} \tag{35}$$

and in particular $E(P^{pgm}(\mathcal{E})) > 0.720$ when $n \le d$.

Proof. The matrix $R = \sqrt{nd}\,S(\mathcal{E})$ fulfils the criteria for the Marčenko-Pastur law
(Theorem 4.1), as its entries are complex random variables with mean 0 and variance 1. We
therefore have

$$E(P^{pgm}(\mathcal{E})) \ge E\left(\frac{1}{n}\|S(\mathcal{E})\|_{tr}^2\right) \ge \frac{1}{n}E(\|S(\mathcal{E})\|_{tr})^2 = \frac{1}{n^2d}E(\|R\|_{tr})^2 \tag{36}$$

and plugging in the lower bound on the expected trace norm of R from Lemma 4.4 gives
the required result. □
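To get a feel for how the asymptotic bound compares with finite ensembles, one can sample Gaussian states and compute $\sum_i (\sqrt{G})_{ii}^2$ directly. The sketch below assumes NumPy; the dimension d = 50, the number of runs, the seed and the function names are illustrative assumptions, and the states are left unnormalised as in the theorem (components with mean 0 and variance 1/d).

```python
import numpy as np

C = 64.0 / (9.0 * np.pi ** 2)

def asymptotic_bound(r):
    """Lower bound on E(P^pgm) as a function of r = n/d."""
    return 1.0 - r * (1.0 - C) if r <= 1.0 else (1.0 - (1.0 - C) / r) / r

def pgm_success(S):
    """sum_i (sqrt(G))_ii^2 for the Gram matrix G of the state matrix S."""
    w, U = np.linalg.eigh(S.conj().T @ S)
    root = (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T
    return float(np.sum(np.real(np.diag(root)) ** 2))

def random_state_matrix(n, d, rng):
    """n equiprobable states with i.i.d. complex Gaussian components of variance 1/d."""
    psi = (rng.normal(size=(d, n)) + 1j * rng.normal(size=(d, n))) / np.sqrt(2 * d)
    return psi / np.sqrt(n)

rng = np.random.default_rng(3)
d = 50
for r in (0.5, 1.0, 2.0):
    n = int(r * d)
    runs = [pgm_success(random_state_matrix(n, d, rng)) for _ in range(10)]
    print(r, float(np.mean(runs)), asymptotic_bound(r))
```

Each printed line shows the ratio r, the sampled average and the asymptotic bound, so the two can be compared at a glance.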

We can immediately apply this result to the distinguishability of random quantum states
uniformly distributed on the complex unit sphere in d dimensions. A uniformly random
quantum state may be produced by creating a vector v, each of whose components is a
complex Gaussian (say $v_i \sim N(0, 1/d)$), and normalising the result. By the law of large
numbers, as $d \to \infty$, the norm of the resulting vector will approach 1, so the normalisation
step becomes unnecessary. (This can be formalised and is known as Poincaré's lemma [10].)
Therefore, an ensemble of uniformly random states meets the criteria for Theorem 4.5, so
we can lower bound its expected distinguishability.

In fact, in this case, we may exploit the concentration of measure effects characteristic
of high-dimensional spaces to show that for high d almost all ensembles of $n \le d$ states are
quite distinguishable. As with the recent paper [20], our tool will be Lévy's Lemma [17]:

Lemma 4.6. (Lévy's Lemma) [17]
Given a function $f : S^d \to \mathbb{R}$ defined on the d-dimensional real hypersphere $S^d$, and a point
p on the hypersphere chosen uniformly at random,

$$\Pr[|f(p) - E(f)| \ge \epsilon] \le 2\exp\left(\frac{-C(d+1)\epsilon^2}{\eta^2}\right) \tag{37}$$

where $\eta$ is the Lipschitz constant of f, $\eta = \sup_{x,y} |f(x) - f(y)|/\|x - y\|_2$, and C is a positive
constant that may be taken to be $1/(18\pi^3)$.

This is useful for us because a state matrix is precisely such a point on a hypersphere:

Lemma 4.7. Let $\mathcal{E}$ be an ensemble of n equiprobable d-dimensional quantum states picked
uniformly at random. Then, for large d, the state matrix $S(\mathcal{E})$ defines a point picked
uniformly at random on the sphere in nd complex dimensions (equivalently, the real sphere
$S^{2nd-1}$ in 2nd dimensions).

Proof. As noted previously, by the properties of quantum states distributed uniformly at
random, for high d the elements of $S(\mathcal{E})$ will be complex Gaussians with mean 0 and variance
1/nd. The lemma follows. □

Lemma 4.8. Let S be an n x d matrix with $\|S\|_2 = 1$, and define $f(S) = \frac{1}{n}\|S\|_{tr}^2$. Then
the Lipschitz constant $\eta$ of f satisfies $\eta \le 2$.

Proof. See Appendix C. □

Plugging this function f and this value of $\eta$ into Lévy's Lemma gives the following
theorem.

Theorem 4.9. Let $\mathcal{E}$ be an ensemble of n d-dimensional quantum states picked uniformly
at random, and let r = n/d. Set $p = \frac{1}{r}\left(1 - \frac{1}{r}\left(1 - \frac{64}{9\pi^2}\right)\right)$ if n > d, and
$p = 1 - r\left(1 - \frac{64}{9\pi^2}\right)$ otherwise, so that $E(P^{pgm}(\mathcal{E})) \ge p$ by Theorem 4.5. Then

$$\Pr[P^{pgm}(\mathcal{E}) \le p - \epsilon] \le 2\exp\left(\frac{-C(2nd+1)\epsilon^2}{4}\right) \tag{38}$$

where $C = 1/(18\pi^3)$.

Figure 1 shows numerical evidence that ensembles $\mathcal{E}$ of quantum states picked uniformly
at random appear to have a value of $P^{pgm}(\mathcal{E})$ close to this lower bound, even when the states
are (relatively) low-dimensional.



5 Application to oracle identification 

The oracle identification problem may be defined as follows. Given an unknown n-bit
Boolean function $f : \{0, 1\}^n \to \{0, 1\}$ (the oracle), picked uniformly at random from a
known set F of functions, identify f with the minimum number of uses of f. Set N = |F|
and $D = 2^n$. Clearly, classical computation cannot identify f with fewer than $\log_2 N$ queries
in the worst case (as each query may reduce the search space by at most half). However,
quantum computation can sometimes do better. On a quantum computer, we can encode
the oracle as an n qubit unitary operator $U_f$, defined by the action $U_f|x\rangle \mapsto (-1)^{f(x)}|x\rangle$.
Now if the uniform superposition $\frac{1}{\sqrt{2^n}}\sum_{x=0}^{2^n-1}|x\rangle$ is input to the oracle, the following
oracle state will be produced:

$$|\psi_f\rangle = \frac{1}{\sqrt{2^n}}\sum_{x=0}^{2^n-1}(-1)^{f(x)}|x\rangle \tag{39}$$

Figure 1: Asymptotic bound on $P^{pgm}(\mathcal{E})$ vs. numerical results (averaged over 10 runs) for
ensembles of n = 50r 50-dimensional uniformly random states. Panels: (a) $0 \le r \le 2$,
(b) $0 \le r \le 10$.

In some cases, a single quantum query to $U_f$ may be enough to identify f with certainty.
This will be the case if $\langle\psi_f|\psi_g\rangle = 0$ for all $f \ne g$ (although this is not a necessary condition).
The satisfaction of this orthogonality condition may be expected to be a rare event, and is
certainly impossible when N > D. However, if we are content with a small probability of
error, the situation is better: we will show here that, in particular, almost all sets of N = D
oracles may be distinguished almost certainly in a constant number of quantum queries.

The oracle identification problem was introduced and studied by Ambainis et al., who
(among other results) developed a hybrid quantum-classical algorithm for the random oracle
case with which we concern ourselves here. However, the upper bound they obtained in the
case where N = D is only $O(\log_2 N)$ queries, which is no better than classical computation.

Lemma 5.1. Let $\mathcal{E}$ be an ensemble of N D-dimensional oracle states corresponding to
Boolean functions picked uniformly at random (call these random oracle states). Then the
rescaled state matrix $\sqrt{ND}\,S(\mathcal{E})$ defines a point picked uniformly at random on the ND-
dimensional hypercube $\{-1, 1\}^{ND}$.

Proof. Each component of each renormalised state will be $\pm 1/\sqrt{ND}$, with equal probability
of each. □

$\sqrt{ND}\,S(\mathcal{E})$ therefore meets the required conditions for the Marčenko-Pastur law
(Theorem 4.1), so we may say immediately

Lemma 5.2. Let $\mathcal{E}$ be an ensemble of N D-dimensional random oracle states, and set
r = N/D. Then

$$E(P^{pgm}(\mathcal{E})) \ge \begin{cases} 1 - r\left(1 - \frac{64}{9\pi^2}\right) & \text{if } N \le D \\ \frac{1}{r}\left(1 - \frac{1}{r}\left(1 - \frac{64}{9\pi^2}\right)\right) & \text{otherwise} \end{cases} \tag{40}$$

and in particular $E(P^{pgm}(\mathcal{E})) > 0.720$ when $N \le D$.
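Oracle states are simple to construct explicitly, which allows a small-scale illustration. The sketch below assumes NumPy; the function names, the choice of 6-bit functions and the seed are ours. It builds the renormalised oracle states for (i) the D linear functions $f_a(x) = a \cdot x \bmod 2$, whose oracle states are mutually orthogonal, and (ii) N = D functions chosen uniformly at random, and evaluates $\sum_i (\sqrt{G})_{ii}^2$ in each case.

```python
import numpy as np

def oracle_state_matrix(truth_tables):
    """Columns are the renormalised oracle states for equiprobable functions.

    truth_tables is an (N, D) array of 0/1 values, row f listing f(x) for x = 0..D-1.
    """
    N, D = truth_tables.shape
    return ((-1.0) ** truth_tables).T / np.sqrt(N * D)

def pgm_success(S):
    """sum_i (sqrt(G))_ii^2 for a real state matrix S."""
    w, U = np.linalg.eigh(S.T @ S)
    root = (U * np.sqrt(np.clip(w, 0.0, None))) @ U.T
    return float(np.sum(np.diag(root) ** 2))

n_bits = 6
D = 2 ** n_bits

# Linear functions f_a(x) = a.x mod 2: their oracle states are mutually orthogonal.
linear = np.array([[bin(a & x).count("1") % 2 for x in range(D)] for a in range(D)])
p_linear = pgm_success(oracle_state_matrix(linear))

# N = D functions chosen uniformly at random.
rng = np.random.default_rng(4)
random_tables = rng.integers(0, 2, size=(D, D))
p_random = pgm_success(oracle_state_matrix(random_tables))
print(p_linear, p_random)
```

The linear functions are identified with certainty by a single query, while the randomly chosen set gives a single-query success probability that is below 1 but, for this sample, well above 1/2.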

Like the sphere, the high-dimensional hypercube exhibits the concentration of measure
phenomenon, and we can write down a similar result to Lévy's Lemma [17]:

Lemma 5.3. (Concentration of measure on the cube) [17]
Given a function $f : \{-1, 1\}^d \to \mathbb{R}$ defined on a d-dimensional hypercube, and a point p on
the hypercube chosen uniformly at random,

$$\Pr[|f(p) - E(f)| \ge \epsilon] \le 2\exp\left(\frac{-2\epsilon^2}{d\eta^2}\right) \tag{41}$$

where $\eta$ is the Lipschitz constant of f with respect to the Hamming distance,
$\eta = \sup_{x,y} |f(x) - f(y)|/d(x, y)$.

Lemma 5.4. Let H be a point on the nd-dimensional hypercube written down as an n x d
{-1, 1}-matrix, and let $f(H) = \frac{1}{n^2d}\|H\|_{tr}^2$. Then the Lipschitz constant $\eta$ of f satisfies
$\eta \le 4/nd$.

Proof. See Appendix C. □

Plugging this value of $\eta$ into Lemma 5.3 gives

Theorem 5.5. Let $\mathcal{E}$ be an ensemble of N D-dimensional random oracle states. Set
$p = \frac{1}{r}\left(1 - \frac{1}{r}\left(1 - \frac{64}{9\pi^2}\right)\right)$ if N > D, and $p = 1 - r\left(1 - \frac{64}{9\pi^2}\right)$ otherwise, where
r = N/D, so that $E(P^{pgm}(\mathcal{E})) \ge p$ by Lemma 5.2. Then

$$\Pr[P^{pgm}(\mathcal{E}) \le p - \epsilon] \le 2\exp\left(\frac{-ND\epsilon^2}{8}\right) \tag{42}$$

and we have our desired result: with 1 query, all but an exponentially small fraction
of the possible sets of N D-dimensional random oracle states may be distinguished with a
constant probability bounded away from 1/2 (in fact, to get a probability of success greater
than 1/2, we may take r = N/D to be as high as ~ 1.66). A constant number of repetitions
allows this probability to be boosted to be arbitrarily high.

6 Discussion



This work can be seen as part of an overall programme of understanding the behaviour of
random quantum states [20] [22].

There is a fundamental correspondence between the mixed state obtained from an equal
mixture of uniformly random pure states, and that produced by starting with a larger
system in a uniformly random pure state, and tracing out part of the system. Consider a
d-dimensional state

$$\rho_{n,d} = \frac{1}{n}\sum_{i=1}^n |\psi_i\rangle\langle\psi_i| \tag{43}$$

where each state in the set $\mathcal{E} = \{|\psi_i\rangle\}$ is picked uniformly at random. We can think of
$\rho_{n,d}$ as being produced from the following dn-dimensional state (which we consider to live
in a Hilbert space $\mathcal{H}_d \otimes \mathcal{H}_n$) by tracing out the second subsystem:

$$|v\rangle = \frac{1}{\sqrt{n}}\sum_{k=0}^{n-1}|\psi_k\rangle|k\rangle = \frac{1}{\sqrt{n}}\sum_{k=0}^{n-1}\sum_{i=0}^{d-1}\alpha_{ki}|i\rangle|k\rangle \tag{44}$$

for some coefficients $\alpha_{ki}$. As mentioned previously, the $\alpha_{ki}$ will be approximately normally
distributed as N(0, 1/d). So, because of the normalisation factor at the front of the sum, the
overall state $|v\rangle$ has coefficients which are normally distributed and scaled as N(0, 1/dn).
Therefore, this state is picked from the uniform distribution on the unit sphere in $\mathbb{C}^{dn}$.
Popescu, Short and Winter [20] obtained an upper bound on the expected trace distance
of such a state $\rho_{n,d}$ from the maximally mixed state I/d, and used this to show that for
$n \gg d$, $\rho \approx I/d$.

Because the non-zero eigenvalues of the Gram matrix of (rescaled) states in $\mathcal{E}$ are the
same as the eigenvalues of $\rho_{n,d}$ [?], this paper can be seen as obtaining a similar result
to [20] for the fidelity of $\rho_{n,d}$ with the maximally mixed state, via quite different methods.
However, the bound is tighter for n close to d, and the notion of "randomness" of the states
$\{|\psi_i\rangle\}$ is more general (which is simply a side-effect of relying on the powerful Marčenko-
Pastur law).

Appendices

A The PGM is close to optimal 

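Theorem 2.1. (Barnum, Knill) [3] $P^{pgm}(\mathcal{E}) \ge P^{opt}(\mathcal{E})^2$.

Proof. Consider an arbitrary POVM R consisting of measurement operators $\{R_i\}$, and an
arbitrary ensemble $\mathcal{E}$ of renormalised states $\{|\psi'_i\rangle\}$ with a priori probabilities $p_i$, where as
before $|\psi'_i\rangle = \sqrt{p_i}|\psi_i\rangle$ and $\rho = \sum_{i=1}^n |\psi'_i\rangle\langle\psi'_i|$. Assume wlog that $R_i = |\mu_i\rangle\langle\mu_i|$ for some
vectors $|\mu_i\rangle$, as the optimal measurement will always be of this form [7]. Then

$$P^R(\mathcal{E}) = \sum_{i=1}^n \langle\psi'_i|R_i|\psi'_i\rangle = \sum_{i=1}^n |\langle\mu_i|\psi'_i\rangle|^2 = \sum_{i=1}^n |\langle\mu_i|\rho^{1/4}\rho^{-1/4}|\psi'_i\rangle|^2 \tag{45}$$

$$\le \sum_{i=1}^n \langle\mu_i|\rho^{1/2}|\mu_i\rangle\langle\psi'_i|\rho^{-1/2}|\psi'_i\rangle \tag{46}$$

$$\le \sqrt{\sum_{i=1}^n \langle\mu_i|\rho^{1/2}|\mu_i\rangle^2}\,\sqrt{\sum_{i=1}^n \langle\psi'_i|\rho^{-1/2}|\psi'_i\rangle^2} \tag{47}$$

$$\le \sqrt{P^{pgm}(\mathcal{E})} \tag{48}$$

The first and second inequalities are Cauchy-Schwarz inequalities, and the third follows
because the vectors $\{\rho^{1/2}|\mu_i\rangle\}$ can easily be seen to define an ensemble with density matrix
$\rho$:

$$\sum_{i=1}^n \rho^{1/2}|\mu_i\rangle\langle\mu_i|\rho^{1/2} = \rho^{1/2}\left(\sum_{i=1}^n |\mu_i\rangle\langle\mu_i|\right)\rho^{1/2} = \rho \tag{49}$$

and we therefore have $\sum_{i=1}^n \langle\mu_i|\rho^{1/2}|\mu_i\rangle^2 \le 1$, as this is the probability of success of the
measurement R applied to this ensemble. □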



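B Proof of Lemma 4.4

In this appendix we will prove a lemma which immediately implies Lemma 4.4. See [?] for
the facts used about elliptic integrals and hypergeometric series.

Lemma B.1. Let $0 \le r \le 1$ and $A = 1 - \sqrt{r}$, $B = 1 + \sqrt{r}$. Then

$$\int_A^B \sqrt{(y^2 - A^2)(B^2 - y^2)}\,dy \ge \pi r\sqrt{1 - r\left(1 - \frac{64}{9\pi^2}\right)} \tag{50}$$

with equality at r = 0, r = 1.

Proof. We have

$$f(r) = \int_A^B \sqrt{(y^2 - A^2)(B^2 - y^2)}\,dy \tag{51}$$

$$= \frac{B}{3}\left((A^2 + B^2)E(\kappa) - 2A^2K(\kappa)\right), \qquad \kappa = \sqrt{1 - A^2/B^2} \tag{52}$$

$$= \frac{2(1 + \sqrt{r})}{3}\left((1 + r)E\left(\frac{2r^{1/4}}{1 + \sqrt{r}}\right) - (1 - \sqrt{r})^2K\left(\frac{2r^{1/4}}{1 + \sqrt{r}}\right)\right) \tag{53}$$

where K(r) and E(r) are the complete elliptic integrals of the first and second kind,
respectively:

$$K(r) = \int_0^1 \frac{dx}{\sqrt{(1 - x^2)(1 - r^2x^2)}}, \qquad E(r) = \int_0^1 \frac{\sqrt{1 - r^2x^2}}{\sqrt{1 - x^2}}\,dx \tag{54}$$

Note that f(r) may be evaluated explicitly for r = 0 and r = 1, giving 0 and 8/3 respectively.
Now we may apply a standard change of variables (Landen's transformation) to both elliptic
integrals, giving

$$f(r) = \frac{2(1 + \sqrt{r})}{3}\left(\frac{1 + r}{1 + \sqrt{r}}\left(2E(\sqrt{r}) - (1 - r)K(\sqrt{r})\right) - (1 - \sqrt{r})^2(1 + \sqrt{r})K(\sqrt{r})\right)$$

$$= \frac{4}{3}\left((1 + r)E(\sqrt{r}) - (1 - r)K(\sqrt{r})\right) \tag{55}$$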

We now move to the representation of K(r) and E(r) as hypergeometric series, which are
defined as follows (using the notation $a^{\overline{n}} = a(a + 1) \cdots (a + n - 1)$).

$${}_2F_1(a, b; c; r) = \sum_{n=0}^\infty \frac{a^{\overline{n}}\,b^{\overline{n}}}{c^{\overline{n}}\,n!}r^n \tag{56}$$

$$K(r) = (\pi/2)\,{}_2F_1(1/2, 1/2; 1; r^2), \qquad E(r) = (\pi/2)\,{}_2F_1(-1/2, 1/2; 1; r^2) \tag{57}$$

This has the advantage that, by a transformation rule due to Gauss, we can rewrite f(r)
as a single hypergeometric series.

$$f(r) = \frac{2\pi}{3}\left((1 + r)\,{}_2F_1(-1/2, 1/2; 1; r) - (1 - r)\,{}_2F_1(1/2, 1/2; 1; r)\right) \tag{58}$$

$$= \pi r\,{}_2F_1(-1/2, 1/2; 2; r) \tag{59}$$

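Returning to the original inequality, our task has been simplified to showing that

$$g(r) = {}_2F_1(-1/2, 1/2; 2; r)^2 \ge 1 - r\left(1 - \frac{64}{9\pi^2}\right) \tag{60}$$

Evaluating g(r) at 0 and 1 shows that both sides agree at the endpoints, so it suffices to
show that g(r) is concave for $0 \le r \le 1$, which would follow from showing the second
derivative g''(r) to be negative in this region. From the rules governing differentiation of
hypergeometric series, it is easy to show that

$$g''(r) = \frac{1}{32}\left({}_2F_1(1/2, 3/2; 3; r)^2 - 2\,{}_2F_1(-1/2, 1/2; 2; r)\,{}_2F_1(3/2, 5/2; 4; r)\right) \tag{61}$$

The following hypergeometric transformation allows this to be simplified.

$${}_2F_1(a, b; c; r) = (1 - r)^{c-a-b}\,{}_2F_1(c - a, c - b; c; r) \tag{62}$$

$$g''(r) = \frac{1}{32}\left((1 - r)^2\,{}_2F_1(5/2, 3/2; 3; r)^2\right. \tag{63}$$

$$\left. -\, 2(1 - r)^2\,{}_2F_1(5/2, 3/2; 2; r)\,{}_2F_1(3/2, 5/2; 4; r)\right) \tag{64}$$

We will show that ${}_2F_1(5/2, 3/2; 3; r)^2 \le {}_2F_1(5/2, 3/2; 2; r)\,{}_2F_1(5/2, 3/2; 4; r)$ for all
positive r, implying that g''(r) is negative in this region. We write out the two hypergeometric
series explicitly:

$${}_2F_1(5/2, 3/2; 3; r)^2 = \sum_{m,n=0}^\infty \frac{k_mk_n}{3^{\overline{m}}3^{\overline{n}}}, \quad\text{where } k_n = \frac{(5/2)^{\overline{n}}(3/2)^{\overline{n}}}{n!}r^n \tag{65}$$

$${}_2F_1(5/2, 3/2; 2; r)\,{}_2F_1(5/2, 3/2; 4; r) = \sum_{m,n=0}^\infty \frac{k_mk_n}{2^{\overline{m}}4^{\overline{n}}} \tag{66}$$

$$= \sum_{m,n=0}^\infty \frac{k_mk_n}{3^{\overline{m}}3^{\overline{n}}}\left(\frac{3^{\overline{m}}3^{\overline{n}}}{2^{\overline{m}}4^{\overline{n}}}\right) \tag{67}$$

$$= \sum_{m=0}^\infty \frac{k_m^2}{3^{\overline{m}}3^{\overline{m}}}\left(\frac{6 + 3m}{6 + 2m}\right) + \sum_{\substack{m,n=0 \\ m>n}}^\infty \frac{k_mk_n}{3^{\overline{m}}3^{\overline{n}}}\left(\frac{3(2 + n)}{2(3 + m)} + \frac{3(2 + m)}{2(3 + n)}\right) \tag{68}$$

$$\ge \sum_{m=0}^\infty \frac{k_m^2}{3^{\overline{m}}3^{\overline{m}}} + 2\sum_{\substack{m,n=0 \\ m>n}}^\infty \frac{k_mk_n}{3^{\overline{m}}3^{\overline{n}}} = {}_2F_1(5/2, 3/2; 3; r)^2 \tag{69}$$

where elementary methods can be used to show that the bracketed last term in eqn. (68)
is at least 2 for any non-negative m and n. This completes the proof of the lemma. □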
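The key steps of this appendix lend themselves to a numerical cross-check. The sketch below assumes NumPy; the function names, the number of quadrature nodes and the series truncation are our own choices. It evaluates the integral $\int_A^B \sqrt{(y^2 - A^2)(B^2 - y^2)}\,dy$ by Gauss-Legendre quadrature after the substitution $y^2 = A^2\cos^2 t + B^2\sin^2 t$ (which makes the integrand smooth), and compares it with the single-series form $\pi r\,{}_2F_1(-1/2, 1/2; 2; r)$ and with the claimed lower bound $\pi r\sqrt{1 - r(1 - 64/(9\pi^2))}$.

```python
import numpy as np

C = 64.0 / (9.0 * np.pi ** 2)

def hyp2f1_series(a, b, c, r, terms=4000):
    """Truncated Gauss hypergeometric series sum_n a^(n) b^(n) / (c^(n) n!) r^n."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * r
    return total

def f_quadrature(r, nodes=100):
    """integral_A^B sqrt((y^2-A^2)(B^2-y^2)) dy with y^2 = A^2 cos^2 t + B^2 sin^2 t."""
    A, B = 1.0 - np.sqrt(r), 1.0 + np.sqrt(r)
    x, w = np.polynomial.legendre.leggauss(nodes)
    t = 0.25 * np.pi * (x + 1.0)                  # map [-1, 1] onto [0, pi/2]
    y = np.sqrt(A ** 2 * np.cos(t) ** 2 + B ** 2 * np.sin(t) ** 2)
    integrand = (B ** 2 - A ** 2) ** 2 * np.sin(t) ** 2 * np.cos(t) ** 2 / y
    return float(0.25 * np.pi * np.sum(w * integrand))

for r in (0.1, 0.5, 0.9):
    series = np.pi * r * hyp2f1_series(-0.5, 0.5, 2.0, r)
    bound = np.pi * r * np.sqrt(1.0 - r * (1.0 - C))
    print(r, f_quadrature(r), series, bound)
```

For each r, the quadrature and series values should agree closely, and both should lie at or above the bound.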



C Lipschitz constants 

This appendix contains derivations of the Lipschitz constants of the functions used for the
concentration of measure results.

Figure 2: Error in approximation to elliptic integral (50) for $0 \le r \le 1$.

Lemma 4.8. Let S be an n x d matrix with $\|S\|_2 = 1$, and define $f(S) = \frac{1}{n}\|S\|_{tr}^2$. Then
the Lipschitz constant $\eta$ of f satisfies $\eta \le 2$.

Proof. Let k = min(n, d). We have

$$\eta = \sup_{S,T}\frac{|f(S) - f(T)|}{\|S - T\|_2} = \sup_{S,T}\frac{1}{n}\frac{\left|\|S\|_{tr}^2 - \|T\|_{tr}^2\right|}{\|S - T\|_2} \tag{70}$$

$$= \sup_{S,T}\left(\frac{\|S\|_{tr} + \|T\|_{tr}}{n}\right)\frac{\left|\|S\|_{tr} - \|T\|_{tr}\right|}{\|S - T\|_2} \tag{71}$$

$$\le \sup_{S,T}\left(\frac{\|S\|_{tr} + \|T\|_{tr}}{n}\right)\frac{\|S - T\|_{tr}}{\|S - T\|_2} \tag{72}$$

$$\le \sup_{S,T}\frac{\sqrt{k}\left(\|S\|_{tr} + \|T\|_{tr}\right)}{n} \le 2k/n \le 2 \tag{73}$$

The first inequality is a triangle inequality, and the second two are derived from

$$\|S\|_{tr} = \sum_{i=1}^k \sigma_i(S) \le \sqrt{k}\sqrt{\sum_{i=1}^k \sigma_i(S)^2} = \sqrt{k}\,\|S\|_2 \tag{74}$$

which in turn uses a Cauchy-Schwarz inequality. □

Lemma 5.4. Let S be a point on the nd-dimensional hypercube written down as an n x d
{-1, 1}-matrix, and let $f(S) = \frac{1}{n^2d}\|S\|_{tr}^2$. Then the Lipschitz constant $\eta$ of f (with respect
to the Hamming distance) satisfies $\eta \le 4/nd$.

Proof. The proof is very similar to that of Lemma 4.8. As before, let k = min(n, d). We
have

$$\eta = \sup_{S,T}\frac{|f(S) - f(T)|}{d(S, T)} = \sup_{S,T}\frac{1}{n^2d}\frac{\left|\|S\|_{tr}^2 - \|T\|_{tr}^2\right|}{d(S, T)} \tag{75}$$

$$\le \sup_{S,T}\left(\frac{\|S\|_{tr} + \|T\|_{tr}}{n^2d}\right)\frac{\|S - T\|_{tr}}{d(S, T)} \tag{76}$$

$$\le \sup_{S,T}\frac{2\sqrt{k}\left(\|S\|_{tr} + \|T\|_{tr}\right)}{n^2d} \le 4k/n^2d \le 4/nd \tag{77}$$

where, extending inequality (74), we use $\|S\|_{tr} \le \sqrt{k}\,\|S\|_2 \le \sqrt{k}\,\|S\|_1$. □